NoSQL operator: filtertable
Runs standard utilities, such as grep(1) or sed(1), on a NoSQL
table passed via STDIN.
Usage: filtertable [options --] filter [args]
Options:
--input (-i) 'file'
Read input from 'file' instead of STDIN.
--output (-o) 'file'
Write output to 'file' instead of STDOUT.
--help (-h)
Print this help info.
--no-header (-N)
Suppress header from output table.
--
Marks the end of 'filtertable' options and the beginning
of the actual filtering utility.
'filter': the utility to be run (grep, sed, ...).
'args': any extra arguments and options for 'filter'.
Notes:
This operator reads a NoSQL table via STDIN and runs the specified
'filter' program on the table body. Any options and arguments that
are suitable for the specified filter can be given on the command
line.
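The idea of filtering only the table body can be illustrated with a
minimal shell sketch. This is not the actual 'filtertable' code: the
function name 'filter_body' and the file names are hypothetical, and
the sketch assumes a simplified table whose header is exactly the first
line of input.

```shell
# Hypothetical stand-in for 'filtertable', for illustration only.
# Assumption: the table header is exactly the first line of input.
filter_body() {
    IFS= read -r header || return 1   # take the header line off STDIN
    printf '%s\n' "$header"           # pass the header through untouched
    "$@"                              # run the filter on what is left: the body
}

# Simplified sample table: TAB-separated, one header line, three rows.
printf 'id\tname\n1\tfoo\n2\tbar\n3\tbarn\n' > sample_table

# Only the body is filtered, so the header survives even though it
# does not contain "bar".
filter_body grep bar < sample_table > filtered_table
cat filtered_table
```

Because the header never reaches the filter, a pattern that happens to
match a column name cannot alter it.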
This operator can also be used as a pre-processor for other NoSQL
commands, to boost performance on large tables. For instance, on
a table of 20000+ records I got these results:
time getrow 'Field ~ /keyword/' < bigtable.rdb
real 0m0.400s
user 0m0.350s
sys 0m0.050s
time filtertable grep keyword < bigtable.rdb | getrow 'Field ~ /keyword/'
real 0m0.079s
user 0m0.030s
sys 0m0.040s
i.e. roughly five times faster (0.400s vs. 0.079s of real time).
Possible uses for this operator are limited only by your imagination
(and by the availability of suitable unix filters). For instance, it
can be used as a better/faster alternative to 'getrow' if all you need
is to pick a list of primary keys from a NoSQL table, like this:
filtertable -- grep -f pick_list < input_table
where 'pick_list' is a file containing one key per line, each preceded
by a caret (^) and followed by a TAB, so that it can match only the
table's leftmost field, i.e. the primary key.
Likewise, if you want to delete a given set of keys from the table you
can simply take advantage of the grep(1) '-v' option:
filtertable -- grep -v -f delete_list < input_table
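The pattern files used in these two examples can be generated from a
plain list of keys. What follows is a minimal sketch, assuming a
hypothetical file 'keys.txt' with one key per line and keys that contain
no regular-expression metacharacters. Plain grep on a header-less sample
body stands in here for the 'filtertable -- grep ...' invocations.

```shell
# Hypothetical input: one primary key per line.
printf '%s\n' k002 k004 > keys.txt

# Wrap each key as ^key<TAB> so it can only match the leftmost field.
awk '{ printf "^%s\t\n", $0 }' keys.txt > pick_list
cp pick_list delete_list   # the same pattern format serves both uses

# Sample table body (TAB-separated rows, header omitted for brevity).
printf 'k001\talpha\nk002\tbeta\nk0021\tgamma\nk004\tdelta\n' > body

grep -f pick_list body > picked            # keeps rows k002 and k004
grep -v -f delete_list body > remaining    # keeps rows k001 and k0021
```

The trailing TAB matters: without it the pattern '^k002' would also
match the key 'k0021'.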